Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping

Neural Information Processing Systems

Reward shaping is an effective technique for incorporating domain knowledge into reinforcement learning (RL). Existing approaches such as potential-based reward shaping normally make full use of a given shaping reward function. However, since the transformation of human knowledge into numeric reward values is often imperfect, for reasons such as human cognitive bias, fully utilizing the shaping reward function may fail to improve the performance of RL algorithms. In this paper, we consider the problem of adaptively utilizing a given shaping reward function. We formulate the utilization of shaping rewards as a bi-level optimization problem, where the lower level optimizes the policy using the shaping rewards and the upper level optimizes a parameterized shaping weight function to maximize the true reward. We formally derive the gradient of the expected true reward with respect to the shaping weight function parameters and accordingly propose three learning algorithms based on different assumptions. Experiments in sparse-reward CartPole and MuJoCo environments show that our algorithms can fully exploit beneficial shaping rewards, while ignoring unbeneficial shaping rewards or even transforming them into beneficial ones.
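The bi-level scheme described above can be sketched in a deliberately minimal form. The toy below is an illustrative assumption, not the paper's algorithm: a two-armed bandit stands in for the MDP, a single scalar weight `w` stands in for the parameterized shaping weight function, and the upper level uses a finite-difference estimate of the true-return gradient in place of the gradient the authors derive analytically. The shaping reward here is deliberately misleading (it favors the wrong arm), so a useful weight must shrink or flip its sign.

```python
import numpy as np

TRUE_R = np.array([1.0, 0.0])   # true reward: arm 0 is actually better
SHAPE_F = np.array([0.0, 1.0])  # misleading shaping reward: favors arm 1

def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()

def inner_update(theta, w, lr=0.5, steps=50):
    """Lower level: policy gradient on the shaped reward r + w * f."""
    shaped = TRUE_R + w * SHAPE_F
    for _ in range(steps):
        p = softmax(theta)
        baseline = p @ shaped
        # exact softmax policy gradient of the expected shaped reward
        theta = theta + lr * p * (shaped - baseline)
    return theta

def true_return(theta):
    """Expected TRUE reward of the policy (the upper-level objective)."""
    return softmax(theta) @ TRUE_R

# Upper level: adapt the shaping weight w by finite-difference ascent
# on the true return obtained after the lower-level optimization.
w, eps, meta_lr = 1.0, 1e-2, 2.0
for _ in range(100):
    theta0 = np.zeros(2)
    j_plus = true_return(inner_update(theta0, w + eps))
    j_minus = true_return(inner_update(theta0, w - eps))
    w += meta_lr * (j_plus - j_minus) / (2 * eps)

final_theta = inner_update(np.zeros(2), w)
print("learned weight:", round(w, 2), "true return:", round(true_return(final_theta), 2))
```

Because the shaping signal is harmful here, the learned weight drops below 1 (and may go negative, effectively turning the misleading signal into a beneficial one), and the final policy recovers a near-optimal true return. This mirrors the claim that the method can down-weight or invert unbeneficial shaping rewards.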


Review for NeurIPS paper: Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping

Neural Information Processing Systems

This paper proposes a method for learning how to utilize shaping rewards in RL to improve learning. The authors clearly explain the problem and their method, and the experimental results clearly show the method working as intended. I would expect the authors to update the final draft of their manuscript with the additional experiments provided in the author response, and to reference and discuss the relation of their method to crucial pieces of prior work suggested by reviewers, in particular "Human-level performance in 3D multiplayer games with population-based reinforcement learning", which also performs bi-level optimisation of shaping rewards.

